REVIEW

Neurodevelopmental disorders: mechanisms and boundary 
definitions from genomes, interactomes and proteomes 

AP Mullin 1,6, A Gokhale 1,6, A Moreno-De-Luca 2, S Sanyal 1,3, JL Waddington 4 and V Faundez 1,5

Neurodevelopmental disorders such as intellectual disability, autism spectrum disorder and schizophrenia lack precise boundaries 
in their clinical definitions, epidemiology, genetics and protein-protein interactomes. This calls into question the appropriateness of 
current categorical disease concepts. Recently, there has been a rising tide to reformulate neurodevelopmental nosological entities 
from biology upward. To facilitate this developing trend, we propose that identification of unique proteomic signatures that can be 
strongly associated with patients' risk alleles and proteome-interactome-guided exploration of patient genomes could define
biological mechanisms necessary to reformulate disorder definitions. 
Neurodevelopmental disorders (NDDs) are multifaceted conditions
characterized by impairments in cognition, communication,
behavior and/or motor skills resulting from abnormal brain 
development. Intellectual disability, communication disorders, 
autism spectrum disorder (ASD), attention deficit/hyperactivity 
disorder (ADHD) and schizophrenia fall under the umbrella of 
NDDs. 1-3 Currently, there are no biomarkers to diagnose NDDs or to
differentiate between them. Rather, these disorders are categor- 
ized into discrete disease entities, based on clinical presenta- 
tion. 1 This is problematic, as many symptoms are not unique to a 
single NDD, and several NDDs have clusters of symptoms in 
common. For example, impaired social cognition is common to 
ASD and schizophrenia, 4-7 and psychosis is observed not only in 
schizophrenia but also in those with bipolar disorder or major 
depressive disorder. 8,9 Thus, such overlap of clinical symptoms 
presents a challenge for nosology and course of treatment. This is 
in stark contrast to other disorders, such as cardiovascular 
diseases, where diagnosis is rooted in biological manifestations, 
biomarkers and pathophysiology. The diffuse clinical boundaries 
among NDDs call into question the appropriateness of current
disease definitions. 10-13 Here, we advocate reformulating current 
nosological categories with novel disorder definitions rooted in 
the biology of processes that are awry in NDDs. We predict that 
biological disorder definitions will change the way we use 
symptomology for diagnosis. 



NEURODEVELOPMENTAL DISORDERS: BOUNDARY 
DEFINITIONS FROM GENOMES 

The hypothesis that NDDs are distinct nosological entities predicts 
that genetic factors associated with risk for or causation of a given 
disorder should segregate with diagnostic categories; thus, in 
classical terms, there should be little or no overlap among the 
genetic factors implicated in each NDD. That is, the genes that 
operate in one disorder should not be involved in another. 
However, genetic epidemiology reveals substantive overlap 
between genes conferring risk for or causing NDDs. 

Genetic defects associated with risk or causation of NDDs range 
from large chromosomal deletions to single-nucleotide polymorph- 
isms (SNPs). Notably, among major genomic defects, a number of 
chromosomal deletions are associated with intellectual disability, ASD 
and schizophrenia. 12,14-16 Among the most frequent are 1q21.1, 
16p11.2 and 22q11.2. 12,17 Given the large number of genes affected by
these deletions, it is hardly surprising that they give rise to
disorders with overlapping phenotypes. However, even smaller genetic
modifications, specifically SNPs in non-coding regions, are shared 
among diverse NDDs. 18 Genetic overlap among NDDs extends to 
monogenic defects that affect the coding sequence and expression of 
a single polypeptide encoded by the gene (for example, SHANK3,
NRXN1, DISC1, FMR1, MECP2, GPHN). Patients carrying these mutations
are diagnosed with intellectual disability, ASD, schizophrenia or
combinations thereof. 19-33 Monogenic defects also affect
subunits of obligate and stable protein complexes. For example,
the adaptor complex AP-3 is an obligate heterotetramer that
generates vesicles from early endosomes bound for lysosomes/
synapses; 34,35 human mutations in a neuronal-specific AP-3 subunit
(AP3B2) associate with ASD. 36,37 Thus, irrespective of the size of a 
genetic defect, there is a continuously expanding list of affected 
genes that do not respect categorical diagnostic boundaries 
among NDDs. 



NEURODEVELOPMENTAL DISORDERS: BOUNDARY 
DEFINITIONS FROM INTERACTOMES 

Protein interaction networks (interactomes) to which NDD genes 
belong also overlap. Interactomes built from genes associated 
with intellectual disability, ASD, ADHD and schizophrenia converge
on common molecular pathways. 38 Genes associated with
these NDDs intersect on one out of 700 genes catalogued as risk 
factors. However, the list of common proteins shared by these 
NDDs increases to 147 out of the 700 genes simply by expanding 
the gene catalog to include predicted first-degree interacting 
neighbors obtained from protein-protein interaction data- 
bases. 38 These computational studies support the concept that 
the interactomes associated with NDDs overlap. However, the 
power of these types of studies is limited by the present quality of 
protein interaction databases, which are incomplete, are only 
moderately curated to accommodate newly published findings and 
are often populated by results not confirmed by alternative 
biochemical, genetic and/or functional approaches. 39-41 Furthermore,
protein interaction databases are biased by the experimental 
approach used in their generation; for example, most protein 
interaction databases poorly represent membrane proteins that 
are not amenable to exploration by traditional yeast two-hybrid or
pull-downs with recombinant proteins. 42 
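
The first-degree neighbor expansion described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of the cited study: the gene labels, the per-disorder gene lists and the interaction edges are toy placeholders, and the function names are hypothetical.

```python
from functools import reduce


def first_degree_expansion(genes, edges):
    """Return the seed genes plus every direct interaction partner in `edges`."""
    expanded = set(genes)
    for a, b in edges:
        if a in genes:
            expanded.add(b)
        if b in genes:
            expanded.add(a)
    return expanded


def shared_across(gene_sets):
    """Return the genes present in every one of the given sets."""
    return reduce(set.intersection, gene_sets)


# Toy risk-gene lists per disorder (placeholders, not real associations).
risk_genes = {
    "ID": {"g1", "g2"},
    "ASD": {"g1", "g3"},
    "SCZ": {"g1", "g4"},
}
# Toy undirected protein-protein interactions (placeholders).
ppi_edges = [("g2", "g5"), ("g3", "g5"), ("g4", "g5"), ("g4", "g6")]

# Overlap among the seed lists alone.
seed_overlap = shared_across(list(risk_genes.values()))
# Overlap after each list is expanded with its first-degree neighbors.
expanded_overlap = shared_across(
    [first_degree_expansion(genes, ppi_edges) for genes in risk_genes.values()]
)

print(sorted(seed_overlap))
print(sorted(expanded_overlap))
```

In this toy case the overlap grows because g5 interacts with a different risk gene in each disorder. The result is only as trustworthy as the edge list, which is the limitation of database-derived interactomes raised above.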



The interactome of the schizophrenia susceptibility gene 
DTNBP1 well illustrates several of these problems (Figure 1). 
DTNBP1 encodes dysbindin, a subunit of the BLOC-1 complex. 43-51
This complex participates in membrane protein trafficking 
between endosomes and lysosomes, and between endosomes 
located in neuronal cell bodies and the synapse. 35,50,51 Published 
in silico dysbindin interactomes 52,53 differ from biochemically and 
genetically tested protein interaction networks. 37 However, discre- 
pancies among interactomes expand beyond those published 
(Figures 1a and d). Three protein interaction databases report 
associations that differ from each other in interactor identities. 
Furthermore, feeding those associations into a rigorous algorithm 
for 'de novo' generation of interactomes reveals different network 
topologies (Figures 1a and d). 54 Only one of these four dysbindin 
interactomes links dysbindin with the adaptor complex AP-3, 
despite multiple lines of biochemical, cell biological and genetic evidence
that these complexes interact in vivo and in vitro (Figure 1a). 55-62
This deficiency in existing databases has immediate ramifications, 
as mutations in AP3B2 associated with ASD cannot be linked to 
Figure 1. DTNBP1-dysbindin interactomes differ in their constituents and topology. Interactomes were assembled with the Dapple algorithm
(http://www.broadinstitute.org/mpg/dapple/dapple.php) 54 using as inputs the dysbindin associated proteins identified by affinity 
chromatography (a), and interactors reported in three protein-protein interaction databases: (b) Biogrid (http://thebiogrid.org/), (c) 
Genemania (http://www.genemania.org/) and (d) String 9.05 (http://string.embl.de/). Red boxes highlight DTNBP1. Note that the identity of 
interacting proteins differs among interactomes. Color code represents a Dapple estimated probability that a protein would be as connected 
to other proteins (directly or indirectly) by chance as is depicted. Only interactome A presents a biochemically and genetically confirmed 
interaction between the adaptor complex AP-3 and the dysbindin-containing BLOC-1 complex. 
schizophrenia through the BLOC-1 subunit dysbindin. AP3B2 is
not an isolated instance. Rather, only the experimentally defined 
dysbindin interactome identifies SNAP29 and CLTCL1. 37 SNAP29
has been identified as a de novo risk factor for schizophrenia, while 
both SNAP29 and CLTCL1 map to the chromosome interval 
affected in velocardiofacial (chromosome 22q11.2 deletion)
syndrome. 17,63 This syndrome closely associates with schizophre- 
nia, ASD and intellectual disability. 17 Gaps in content and quality 
in relation to protein interaction databases are important, as these 
repositories are the foundation for molecular connectivity 
between genetic defects associated with a given disorder. These 
deficiencies are missed opportunities for establishing molecular 
mechanisms of disease and finding mechanistic commonalities 
among NDDs. Thus, we argue in favor of generating interactomes 
confirmed by biochemical, genetic and/or functional strategies. 
Epidemiological genomics offers the field a good selection of solid
candidate genes with which to begin this quest. 
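
One simple way to quantify how much interactor lists from different sources disagree is a pairwise Jaccard index, together with a list of experimentally supported interactors that a given database omits. The sketch below is an assumption-laden illustration: the source names and protein labels are placeholders and do not reproduce the content of any real database or of Figure 1.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared members divided by all members."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(sources):
    """Jaccard index for every pair of sources, keyed by sorted source names."""
    return {
        (x, y): jaccard(sources[x], sources[y])
        for x, y in combinations(sorted(sources), 2)
    }


def missing_from(sources, reference, candidate):
    """Interactors present in the reference source but absent from the candidate."""
    return sources[reference] - sources[candidate]


# Placeholder interactor lists for one bait protein (hypothetical labels).
interactors = {
    "affinity_chromatography": {"p1", "p2", "p3", "p4"},
    "database_A": {"p1", "p5"},
    "database_B": {"p1", "p2", "p6"},
    "database_C": {"p7"},
}

overlap = pairwise_overlap(interactors)
for pair, score in overlap.items():
    print(pair, round(score, 2))

print(sorted(missing_from(interactors, "affinity_chromatography", "database_A")))
```

A low index between an experimentally defined list and a database-derived one flags the kind of gap discussed above, where a confirmed interaction cannot be used to connect two risk genes.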



NEURODEVELOPMENTAL DISORDERS: 'GUILTY BY 
ASSOCIATION' MECHANISMS OF DISEASE AND THEIR 
INCLUSION IN INTERACTOMES 

Loss of one protein function due to a genetic mutation can alter 
levels or activity of other proteins that interact either directly or 
indirectly with the mutant protein. These 'guilty by association' 
proteins can be the actual culprits of disease phenotypes. The 
concept is illustrated readily by Marfan syndrome, a connective 
tissue disorder in which morbidity and mortality are chiefly 
associated with aortic aneurysms. 64,65 This disease is caused by
mutations in the extracellular matrix protein fibrillin-1 (FBN1),
which organizes into 10 nm fibers. 64,65 An old pathogenic
hypothesis considered that loss of fibrillin fibers decreased blood
vessel resilience to mechanical stress. 64,65 However, there is a new
conceptualization of this syndrome that pinpoints abnormal TGFβ
signal transduction as the main culprit in its vascular pathology.
This unexpected shift can be understood from the fact that fibrillin
binds and presents TGFβ to its receptor at the optimal
concentration, time and location. 64-66 Thus, TGFβ is a 'guilty by
association with fibrillin' protein. 

Subunits of protein complexes are particularly susceptible to 
being 'guilty by association' proteins. Genetic defects, or even 
non-pathogenic allelic variation affecting a single subunit of a 
protein complex, frequently lead to downregulation and/or 
covariation of other complex subunits. 67-72 DTNBP1 null mutations
abrogating dysbindin expression downregulate most subunits of 
the BLOC-1 complex, despite the monogenic character of the 
mutation. 44,50,69 Reciprocally, genetic defects in other BLOC-1
subunits decrease dysbindin cellular content. 44,50 'Guilty by 
association' proteins in the dysbindin interactome extend beyond 
intrinsic components of the BLOC-1 complex. These proteins 
include membrane protein cargoes such as VAMP7 (VAMP7), a
synaptic vesicle fusogenic membrane protein (SNARE) implicated
in spontaneous synaptic vesicle fusion, and the Menkes disease
copper transporter (ATP7A); the adaptor complex AP-3; RhoGEF1
(ARHGEF1); and BDNF (BDNF), a neurotrophin with a long history of
association with several NDDs. 57,59,73-76 None of these proteins,
whose levels are affected by mutations in DTNBP1 or other
BLOC-1 complex subunits, can be identified in current protein
interaction databases that focus on physical protein-protein
interactions. This problem prevents their inclusion in any analysis
seeking to connect genetic defects found in genome-wide
association studies to relevant molecular pathology.



CREATING A NOSOLOGY FROM GENOME INFORMED 
PROTEOMES-INTERACTOMES 

Genome-wide association studies (GWAS) search the genomes of 
clinically defined patient populations for genetic markers that 
reach a threshold of statistical significance to associate with disease
risk (Figure 2a). This approach encounters the problem that these 
disorders are polygenic and that categorical NDD definitions are 
not linked to biological markers or molecular phenotypes. 77,78 
Thus, it is likely that genetically heterogeneous patient cohorts in 
these studies gather multiple molecular mechanisms of disease. 
However, these studies offer powerful insight when a particular 
genetic marker reaches statistical significance, despite the 'noise' 
introduced by the polygenic character of these disorders and the 
problems intrinsic to categorical NDD definitions. Genetic defects 
associated with one or multiple NDDs should be seen as the tip of
the iceberg to unravel biological mechanisms of disease. Interac- 
tomes of gene products consistently implicated in NDDs ('tip of 
the iceberg genes') are a fertile ground to search for disease 
mechanisms. 54,79 This prediction stems from the hypothesis that 
genomes of patients affected by polygenic NDDs should concentrate
alleles that affect the expression or function of genes whose
products belong to or modulate a relevant pathway. 80 We illustrate 
this concept in Figure 2b where gene-a has reached statistical 
significance in a population GWAS. The product encoded by gene-a 
is a bait to 'fish out' the red protein interaction network (Red 
interactome B1, Figure 2b). The biochemical definition of inter- 
actome 1 would occur irrespective of whether interactome 1 
contains products encoded by genes carrying defects that do not 
cross a population statistical threshold (Figure 2b). 

This genome to proteome 'reverse' approach is not foreign to 
current genomic studies, in which bioinformatics of protein- 
protein interaction databases are used to find connections 
between gene defects that associate with a disorder at a GWAS 
level 36,79 (Figure 2c). However, mapping GWAS results back to an 
interactome requires the availability of several network genes that 
cross a statistical threshold (Red interactome 1, Figure 2c) as well 
as pre-existing and reliable protein interaction databases. Genes 
below statistical threshold in the red network C1 would not 
contribute to the identification of the C1 interactome (Red 
interactome C1, Figure 2c). Moreover, current criteria to allocate
GWAS results to an interactome would miss the yellow 
interactome C2 where genes encoding interactome products are 
all below statistical threshold (Figure 2c). 

How can we obtain mechanistic insight from studying 'omes? 
We propose two non-exclusive approaches to define the biology 
of NDDs using protein-protein interaction networks and geno- 
mics. The first approach is through the definition of 'tip of the 
iceberg gene' protein networks, such as those depicted by the red 
interactomes in Figures 2b and c. Second, reliable protein 
interactomes can be used as a query matrix to explore patients'
genomes for genetic defects or variants targeting interactome- 
encoding loci. Different patients may carry defects in one or more 
genes encoding products belonging to an interactome. Individually,
none of these gene defects reaches statistical significance in a 'gene-centric'
GWAS (Subjects 1-3, Figure 2d). However, collective analysis
of the genomes in a cohort of patients (Subjects 1-3, Figure 2d)
shows significant enrichment of genetic defects clustered on a 
common pathway (compare red and yellow interactomes, Figure 2e). 
The association of a biological mechanism, defined by an already 
known and reliable interactome, with the genome of affected 
individuals would occur, although each gene in isolation would
not have risen above statistical threshold. In this case, statistical
significance is assigned to a collection of genes defining an 
interaction network rather than a single gene (Figure 2e). 
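
A minimal sketch of such an interactome-centric test follows. It uses a standard hypergeometric over-representation test as a stand-in, which is an assumption of this example and not a method specified in the text. The gene labels, the subjects' defect lists, the interactome membership and the background size of 100 genes are all toy placeholders.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the interactome."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Toy cohort: genes carrying a defect in each subject (placeholders).
subject_defects = {
    "subject_1": {"y1", "b7"},
    "subject_2": {"y2", "b9"},
    "subject_3": {"y3", "y1"},
}
# Toy interactome membership (the "yellow" network in this example).
yellow_interactome = {"y1", "y2", "y3", "y4"}
background_size = 100  # hypothetical number of genes that were tested

# Pool the distinct genes hit anywhere in the cohort.
hit_genes = set().union(*subject_defects.values())
hits_in_interactome = hit_genes & yellow_interactome

p_value = hypergeom_sf(
    k=len(hits_in_interactome),
    N=background_size,
    K=len(yellow_interactome),
    n=len(hit_genes),
)
print(len(hit_genes), len(hits_in_interactome), p_value)
```

Here no single gene is hit in more than two of the three subjects, yet three of the five affected genes fall within a four-gene network, which is unlikely by chance against a 100-gene background. A real analysis would also need to account for gene length, linkage disequilibrium and multiple testing across interactomes, none of which this sketch models.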

These solutions depend on reliable protein interaction networks.
As mentioned above, the quality of commonly used protein-protein
interaction databases is substandard. This is
due to a lack of thorough biochemical, functional and/or genetic 
confirmation of interactions. We posit that it is possible to extract 
more information about disease mechanisms and disorder 
boundaries from current GWAS if reliable protein
interaction maps were to exist. As these are either not available 
Figure 2. Models of cross-fertilization between genomes, proteomes and interactomes. Grid in diagrams (a) to (e) depicts a polygenic genetic
landscape associated with an NDD. Circles represent defined genes within the grid that when affected in different combinations trigger an NDD.
Bars above each gene indicate a subject where a gene defect was found on a GWAS. Blue bars are those subjects that have a defect in a gene
below statistical threshold, which is marked by the asterisk in (a). Red bars above a gene represent subjects that have a defect in a gene above
statistical threshold. (b) Depicts a 'tip of the iceberg gene a' and the network to which it belongs represented by the connected red circles
(interactome 1). (c) Depicts three 'tip of the iceberg genes' and the network to which they belong (interactome 1). The yellow interactome 2 is
constituted by genes below statistical threshold as defined by gene-centric GWAS statistical analysis. (d) Represents genetic defects (blue bars)
in two interactomes per patient (subjects 1-3). Note that in all patients there are no gene defects in the red interactome. (e) Depicts
hypothetical results of an interactome-centric GWAS that includes subjects 1-3 in (d). The yellow interactome 2 is now above statistical 
threshold as defined by an interactome-centric GWAS statistical analysis. See text for details. 



or are under construction, we propose to focus efforts on
defining the interactomes of (a) NDDs 'tip of the iceberg genes' as 
well as (b) 'guilty by association' proteins detected in the 
proteomes of cells carrying genetic defects in 'tip of the iceberg 
genes'. These and other experimentally confirmed interactomes 
(yellow interactome 2 in Figure 2e) would allow us to extract novel 
genetic information from existing and future GWAS. 

CREATING A GENOME-INDEPENDENT NOSOLOGY FROM 
PROTEOMES-INTERACTOMES 

Human proteomes are heritable molecular phenotypes 72 and as
such constitute valuable, yet untapped, resources to create 
disorder classifications rooted in molecules and their pathways. 
The study of proteomes shares with the analysis of genomes its 
quantitative and unbiased character. However, proteomes and 
interactomes offer the distinctive advantage of being executors of 
phenotypic programs in cells and tissues. Therefore, proteomes 
and interactomes are causally closer to the identity of disease 
mechanisms than genomes. Proteomes are already beginning 
to shed light on complex neurological disorders such as 
schizophrenia. 81,82 However, we should not limit ourselves to just 
exploring postmortem brains of subjects grouped solely by their 
clinical features. Instead, we advocate for the study of proteomes 
from cells isolated from individuals that are genetically related. 



Cell proteomes from affected probands compared with their 
unaffected first-degree relatives offer a great prospect for the 
identification of heritable or de novo abnormalities in molecular
phenotypes. Evidently, in the context of NDDs, human induced
pluripotent stem cells are a great resource, as they can be 
differentiated into neurons. 83 However, it is likely that the 
molecular mechanisms affected in NDDs are common to many,
if not all, cells. This is the case, for example, in Fragile X syndrome and velocardiofacial
syndrome, where multiple tissues are affected. 12 Thus, fibroblasts
or lymphoblasts from human pedigrees are likely to offer valuable 
insights into neuronal disorders. We predict that proteomes built 
from genetically related subjects' cells will bridge two camps. On 
one hand, proteomes will help us to interpret results from 
genome-wide analyses. On the other hand, they will guide us to 
define NDD mechanisms at levels of complexity higher than the 
traditional single genes or proteins. These would include, for 
instance, subcellular compartments, such as synapses or mito- 
chondria, and deficits in tissue organization, such as those in 
neural circuits. Genomes, proteomes and interactomes give us
vantage points; the inevitable next step is to dive deep into the
biology emerging from and converging on them.